ON THE FRACTAL NATURE OF INTERNET


Radu DOBRESCU, Roland ULRICH


Rezumat. Lucrarea analizeaza doua cai de a demonstra natura fractala a
internetului. In primul rand, prezinta autosimilaritatea traficului pe Internet si
propune un model fractal pentru acesta. In al doilea rand, propune un model
independent de scara a topologiei Internet-ului. In continuare, autorii demonstreaza
ca structura fractala a topologiei influenteaza comportarea fractala a traficului pe
Internet. Aceasta afirmatie este sustinuta prin cateva rezultate experimentale.


Abstract. The paper analyses two ways to demonstrate the fractal nature of
the Internet. First, it presents the self-similarity of Internet traffic and proposes a
fractal model of this traffic. Second, it proposes a scale-free model of the Internet
topology. Furthermore, the authors demonstrate that the fractal structure of the
topology influences the fractal behaviour of the Internet traffic. This assertion is
supported by some experimental results.
1. Introduction


Traffic flowing through telecommunication networks in the pre-Internet age
was predominantly voice. The number of calls arriving at a station, namely the
counting process, was well approximated by a Poisson or renewal process. Arrivals
were memoryless in the Poisson case, or memoryless at renewal points in the
renewal case, and interarrival intervals were exponentially distributed. The Poisson
arrival model and the exponentially distributed holding time model allowed
analytically and computationally simple Markov chains to be used for much of
telephone traffic modeling. An M/M/1/K chain, for example, accurately models a
single-server finite-queue system with exponential service and Poisson arrivals,
yielding closed-form solutions for the queue length distribution, the waiting time
distribution, the blocking probability, etc.
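
As an illustration of how simple these closed forms are, the following minimal sketch evaluates the standard steady-state results for an M/M/1/K queue. The function name `mm1k_metrics` and its argument names are chosen here for illustration only, and K is assumed to count the customer in service as well as those waiting.

```python
def mm1k_metrics(arrival_rate, service_rate, capacity):
    """Steady-state closed forms for an M/M/1/K queue.

    capacity (K) is the maximum number of customers in the system,
    including the one in service (an assumption of this sketch).
    """
    if arrival_rate <= 0 or service_rate <= 0 or capacity < 1:
        raise ValueError("rates must be positive and capacity at least 1")
    rho = arrival_rate / service_rate
    if abs(rho - 1.0) < 1e-12:
        # Degenerate case rho = 1: all K + 1 states are equally likely.
        probs = [1.0 / (capacity + 1)] * (capacity + 1)
    else:
        # Truncated geometric distribution: p_n = rho^n (1 - rho) / (1 - rho^(K+1)).
        norm = (1.0 - rho ** (capacity + 1)) / (1.0 - rho)
        probs = [rho ** n / norm for n in range(capacity + 1)]
    blocking = probs[-1]  # an arrival is lost when it finds K customers present
    mean_in_system = sum(n * p for n, p in enumerate(probs))
    accepted_rate = arrival_rate * (1.0 - blocking)
    return {
        "queue_length_distribution": probs,
        "blocking_probability": blocking,
        "mean_number_in_system": mean_in_system,
        # Little's law applied to the accepted arrivals only.
        "mean_time_in_system": mean_in_system / accepted_rate,
    }
```

For example, `mm1k_metrics(1.0, 2.0, 2)` describes a queue with load 0.5 and room for two customers. No comparably compact formulas are available once the arrival process exhibits dependence across time scales, which is the situation discussed next.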


Internet traffic, however, behaves very differently from such simple Markovian
models. Traffic measurements made on Local Area Networks (LAN) and Wide
Area Networks (WAN) suggest that traffic exhibits variability (traditionally called
‘burstiness’) over multiple time scales [1]. The second-order properties of the
counting process of the observed traffic display behavior that is associated with
self-similarity, multifractality and/or long-range dependence (LRD). This indicates
that there is a certain level of dependence in the arrival process.
Short-range and long-range dependencies often manifest themselves in a network
by causing frequent and irrecoverable packet losses and other serious effects.


Dependencies and burstiness in traffic hence attracted an enormous amount of
attention from researchers. They attempted to develop mathematically based
models that would help explain the nature of the systems exhibiting such
phenomena and provide critical insight into the actual mechanisms that led to this
behavior. Models such as fractional Brownian motion, chaotic maps, etc. were suited
to capturing the second-order self-similar behavior of traffic [2]. Their results were
difficult to obtain and harder to apply, and such models did not provide insight into
the actual mechanism of traffic generation. Many analytically simpler modeling
attempts to capture the first- and second-order properties of counts did not predict
the queuing behavior well enough. In the late 1990s researchers discussed the impact
of other properties of the self-similar process, such as marginal distributions, on
accurately predicting the queuing behavior. A simpler, more accurate and
analytically tractable model that is meaningful on physical grounds and provides
more physical insight would help network designers produce more
effective and efficient designs.


Some ideas generated in the last decade offer promise towards crafting the model. 
Andersen and Nielsen [3] illustrated that continuous parameter Markov chains
(cpMc) can model the dependencies in network traffic over multiple time scales; 
the advantage of such models is the availability of ready-made tools for analysis. 
Their model matched the second order properties of the self-similar process 
closely, but it was not sufficient for accurate prediction of queuing behavior. 
Grossglauser and Bolot [4] discussed both the importance of limiting the view to 
the finite range of time scales of interest, and the influence of marginal 
distributions in performance evaluation and prediction problems. From the above 
discussions, one can infer that both the second order and marginal properties of 
the process need to be matched for more accurate results. Salvador et al. [5] 
achieved some degree of success by using a fitting procedure that matched both 
the marginal distribution and autocovariance of the counting process, but a
solution form that provides deep insight into the system was still missing. 


The Internet is a prime example of a self-organizing complex system, having 
grown mostly in the absence of centralized control or direction. In this network, 
information is transferred in the form of packets from the sender to the receiver 
via routers, computers which are specialized to transfer packets to another router 
“closer” to the receiver. A router decides the route of the packet using only local 
information obtained from its interaction with neighboring routers, not by 
following instructions from a centralized server. A router stores packets in its 
finite queue and processes them sequentially. 
However, if the queue overflows due to excess demand, the router will discard
incoming packets, a situation corresponding to congestion. A number of studies
have probed the topology of the Internet and its implications for traffic dynamics
[6,7].


To efficiently control and route the traffic on an exponentially expanding Internet,
one must not only capture the structure of the current Internet, but also allow for
long-term network design. Until recently, all Internet topology generators provided
versions of random graphs, but in 1999 Faloutsos et al. [8] discovered that the
Internet is a scale-free network with a power-law degree distribution. Several
contributors found that the
Internet flow is strongly localized: most of the traffic takes place on a spanning 
network connecting a small number of routers which can be classified either as 
“active centers,” which are gathering information, or “databases,” which provide 
information. Experimental evidence for self-similarity in various types of data 
network traffic is already overwhelming and continues to grow. So far, 
simulations and analytical studies have shown that it may have a considerable 
impact on network performance that could not be predicted by the traditional 
short-range-dependent models. The most serious consequence of self-similar 
traffic concerns the size of bursts. Within a wide range of time-scales, the burst 
size is unpredictable, at least with traditional modeling methods. 


This is the point from which the authors of this paper assume that the traffic
behavior is strongly influenced by, and depends on, the scale-free structure of the
network. We have also demonstrated that the scale-free Internet model displays a
number of properties that distinguish it from random graphs: wiring redundancy and
clustering, non-trivial eigenvalue spectra of the connectivity matrix and a
scale-free degree distribution.


2. Evidence of traffic self-similarity 
2.1. General considerations 


Using a number of experiments, the following results towards characterizing and 
quantifying the network traffic processes have been achieved: 


First, self-similarity is an adaptive property of traffic in networks. Many factors are
involved in creating this characteristic. A new view of this self-similar traffic 
structure is provided. This view is an improvement over the theory used in most 
current literature, which assumes that the traffic self-similarity is solely based on 
the heavy-tailed file-size distribution. 


Second, the scaling region for traffic self-similarity is divided into two timescale
regimes: short-range dependence (SRD) and long-range dependence (LRD).
Experimental results show that the network transmission delay (round-trip time,
RTT) separates the two scaling regions. This gives us a physical source of the
periodicity in the observed traffic. Also, bandwidth, TCP window size, and packet
size have an impact on SRD. The statistical heavy-tailedness (Pareto shape
parameter) affects the structure of LRD. In addition, a formula to quantify traffic
burstiness is derived from the self-similarity property.


Furthermore, studies of fractal traffic with multifractal analysis have given more
interesting and applicable results. (1) At large timescales, increasing bandwidth
does not improve throughput (or network performance). The two factors affecting
traffic throughput are network delay and TCP window size. On the other hand,
more simultaneous connections smooth the traffic, which could improve network
efficiency. (2) At small timescales, traffic burstiness varies. To improve network
efficiency, bandwidth, TCP window size, and network delay must be controlled so
as to reduce traffic burstiness. These factors trade off against each other, and
their combined effect is nonlinear. (3) In general, network traffic processes have a
Hölder exponent $\alpha$ ranging between 0.7 and 1.3, and their statistics differ
from those of Poisson processes. To apply this knowledge from traffic analysis and
to improve network efficiency, a notion of efficient bandwidth, EB, is derived to
represent the fractal concentration set. Above that bandwidth, traffic appears
bursty and the burstiness cannot be reduced by multiplexing; below it, traffic
is congested. An important finding is that the relationship between the bandwidth
and the transfer delay is nonlinear.


The past few decades have seen an exponential growth in the amount of data 
being carried across packet switched networks, and particularly the Internet. This 
growth has brought packet switched networks to the point where the amount of 
traffic being carried on them is expected to exceed that carried on traditional 
circuit switched technology in the very near future. Packet switched networks are 
not new. They have been around for over 30 years. During that time, a number of 
models for the traffic carried across them have also been proposed. Early attempts
at modeling network traffic relied on Markovian models, such as the
Markov-Modulated Poisson Process (MMPP) [9]. Markovian models were familiar to
teletrafficists, due to their long association with the modeling of telephony traffic,
and have the advantage of being generally tractable.


In recent analyses of traffic measurements, evidence of non-Markovian effects,
such as burstiness across multiple time scales, long-range dependence and
self-similarity, has been observed in a wide variety of traffic sources. As is clearly
shown in [10, 11, 12], the performance of processes exhibiting these properties is
radically different from that of the traditional models. Given the evidence of long 
range dependence and self-similarity in such a wide variety of sources, it is clear 
that any general model for data traffic must account for these properties. This has 
led to the development of a number of new models. 
2.2. Testing a fractal traffic model


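Mandelbrot and his co-workers introduced an analogy between self-similar (SS)
processes and fractal processes [8]. Referring directly to the incremental process
$X_{s,t} = X_t - X_s$, Mandelbrot defines stochastic self-similarity as:


$$X_{t_0,\,t_0+rt} \stackrel{d}{=} r^{H} X_{t_0,\,t_0+t}, \quad \forall t_0, t,\ \forall r > 0 \qquad (1)$$

where $\stackrel{d}{=}$ denotes equality in distribution and H is the Hurst parameter.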


Mandelbrot constructs his SS process (fractional Brownian motion, fBm) starting 
with two properties of the Brownian motion (Bm): it has independent increments 
and it is self-similar with Hurst parameter H = 0.5.


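Denoting Bm as $B(t)$ and fBm as $B_H(t)$, here is a simplified version of
Mandelbrot’s definition of the fBm: $B_H(0) = 0$, $H \in [0,1]$ and


$$B_H(t) = \frac{1}{\Gamma(H+0.5)} \left\{ \int_{-\infty}^{0} \left[ (t-s)^{H-0.5} - (-s)^{H-0.5} \right] dB(s) + \int_{0}^{t} (t-s)^{H-0.5}\, dB(s) \right\} \qquad (2)$$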


An SS process is called a long-range dependent (LRD) process if there are
constants $\alpha \in (0,1)$ and $C > 0$ such that


$$\lim_{k \to \infty} \frac{\rho(k)}{C k^{-\alpha}} = 1 \qquad (3)$$


where $\rho(k)$ is the autocorrelation at lag k.


When the autocorrelation is plotted in logarithmic coordinates, the resulting graph
is called the correlogram of the process; by eq. (3) it has an asymptote of slope
$-\alpha$. Note that there are SS processes which are not LRD and, conversely,
there are LRD processes which are not SS. However, the fBm with H > 0.5 is both
SS and LRD.
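
The slope of the correlogram can be read off numerically. The sketch below is only an illustration under stated assumptions: it computes the sample autocorrelation of a series and fits a straight line to log ρ(k) versus log k by ordinary least squares, returning the negated slope as an estimate of α. The function names are hypothetical, and a plain least-squares fit is one simple choice among several estimators.

```python
import math


def sample_autocorrelation(series, max_lag):
    """Sample autocorrelation rho(k) for lags k = 1..max_lag."""
    n = len(series)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must lie between 1 and len(series) - 1")
    mean = sum(series) / n
    centered = [x - mean for x in series]
    energy = sum(c * c for c in centered)
    if energy == 0:
        raise ValueError("series is constant")
    return [
        sum(centered[i] * centered[i + k] for i in range(n - k)) / energy
        for k in range(1, max_lag + 1)
    ]


def estimate_alpha(rho):
    """Least-squares fit of log rho(k) = log C - alpha * log k; returns alpha.

    rho[0] is the autocorrelation at lag 1. Non-positive values are skipped
    because their logarithm is undefined.
    """
    points = [(math.log(k), math.log(r))
              for k, r in enumerate(rho, start=1) if r > 0]
    if len(points) < 2:
        raise ValueError("need at least two positive autocorrelation values")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return -sxy / sxx
```

Since eq. (3) is an asymptotic statement, the fit is only meaningful over the range of lags where the correlogram is visibly linear; restricting `rho` to that range before calling `estimate_alpha` is left to the user.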


In their landmark paper [1], Leland et al. report the discovery of self-similarity in
local area network (LAN) traffic, more precisely Ethernet traffic. To be precise, 
we note that all methods used in [1] (and in numerous papers that followed) detect 
and estimate LRD rather than SS. Indeed, the only “proof” offered for SS per se is 
the visual inspection of the time series at different time-scales. “Self-similarity” 
(actually LRD) has since been reported in various types of data traffic: LAN, 
WAN, Variable-Bit-Rate video, SS7 control, HTTP etc. Lack of access to high- 
speed, high-aggregation links, and lack of devices capable of measuring such 
links have until recently prevented similar studies from being performed on 
Internet backbone links. In principle, traffic on the backbone could be 
qualitatively different from the types enumerated above, due to factors such as 
much higher level of aggregation, traffic conditioning (policing and shaping) 
performed at the edge, and much larger round-trip-time (RTT) for TCP sessions. 
Some researchers have even claimed that aggregating Internet traffic
causes convergence to a Poisson limit. We disagree, for reasons presented in the
next section and because on shorter time scales effects due to the network
transport protocols are believed to dominate traffic correlations, while on longer
time scales non-stationary effects such as diurnal traffic load patterns become
significant.


In our tests we simulated link speeds ranging from 10 Mbps to 622 Mbps,
average bandwidths between 1.4 and 42 Mbps, a minimum time scale of 1 ms (in
only one instance; usually above 10-100 ms), and at most 6 orders of magnitude
of time scales. The correlograms (see Fig. 1) show that the traffic considered
specific to the Internet backbone is indeed asymptotically SS, and also reveal a new
autocorrelation structure for short lags. The autocorrelation function for short lags
has the same power-law form as for long lags, i.e. $\rho(k) \sim k^{-\alpha}$, but
the parameter $\alpha$ assumes significantly larger values: $\alpha \in [0.55, 0.71]$
for $k \in$ [50 μs, 10 ms], compared to $\alpha \in [0.1, 0.18]$ for $k \in$ [100 ms, 500 s].
Fig. 1. Graphic representation of a network with 200 nodes


The first plot in Fig. 1 shows the correlogram for the shortest time unit used in our
analysis. Although the linear trend is clearly present, the dependence is too
irregular to be of much use. For the second plot, the bytes arrived are aggregated in
0.4 ms time intervals, and the two slopes corresponding to the two values of
$\alpha$ are easily seen.
The third is a variance-time plot, which is just another way of looking at LRD. The
straight line corresponds to a Hurst parameter H = 0.5, so clearly the asymptote of
the function represented corresponds to a larger value of H (between 0.84 and 0.96,
to be precise). Since this is an arrival process with an average speed of about
700 Mbps, the hypothesis that at high speeds the traffic becomes Poissonian
(H → 0.5) is rejected.
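
A variance-time estimate of H can be sketched as follows. The series is cut into non-overlapping blocks of size m, the variance of the block means is computed for several values of m, and a line is fitted on log-log axes. For a self-similar process the variance of the block means decays like m^(2H-2), so H is recovered as 1 + slope/2. This is a minimal sketch under that standard assumption, not the procedure used for Fig. 1; the function names and the white-noise helper are introduced here for illustration only.

```python
import math
import random


def aggregated_variance(series, m):
    """Sample variance of the means of non-overlapping blocks of size m."""
    blocks = len(series) // m
    if blocks < 2:
        raise ValueError("need at least two blocks")
    means = [sum(series[i * m:(i + 1) * m]) / m for i in range(blocks)]
    mu = sum(means) / blocks
    return sum((x - mu) ** 2 for x in means) / (blocks - 1)


def hurst_variance_time(series, block_sizes):
    """Estimate H from the log-log slope of block-mean variance against m."""
    points = []
    for m in block_sizes:
        if len(series) // m >= 2:
            v = aggregated_variance(series, m)
            if v > 0:
                points.append((math.log(m), math.log(v)))
    if len(points) < 2:
        raise ValueError("not enough usable block sizes")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx       # close to -1 for short-range dependent data
    return 1.0 + slope / 2.0


def white_noise(n, seed=0):
    """Independent Gaussian samples, used only as a reference with H = 0.5."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]
```

Applied to `white_noise(20000)`, the estimate should lie near 0.5, which corresponds to the straight reference line of the variance-time plot; a long-range dependent trace gives a flatter slope and therefore a larger H.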


3. A scale-free internet model 


3.1. Scale free topology 


To model a distributed network environment like the Internet, it is necessary to
integrate data collected from multiple points in the network in order to get a
complete, network-wide picture of the traffic. Knowledge of the dynamic
characteristics is essential to network management (e.g., detection of
failures/congestion, provisioning, and traffic engineering such as QoS routing or
server selection). However, because of the huge scale and restricted access rights,
it is expensive (sometimes impossible) to measure such characteristics directly. To
solve this, methods and tools for inferring unobservable network
performance characteristics are used in large-scale networking environments. A
model to which inference based on self-similarity and fractal behavior can be applied
is the scale-free network.


Scale-free networks are complex networks in which some nodes are very well
connected while most nodes have a very small number of connections. An
important characteristic of scale-free networks is that they are size independent,
that is, they preserve the same characteristics regardless of the network size N.
Scale-free networks have a degree distribution that follows a power law,
$P(k) \sim k^{-\lambda}$, where the exponent $\lambda$ varies approximately from 2
to 3 for most real networks. Many real networks have a scale-free degree
distribution, including the Internet. The algorithm used to generate the scale-free
network topology produces networks whose proportion of cycles can be
controlled; in our case, approximately 4% of the added nodes form a cycle.
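
To make the power law concrete, the sketch below converts it into a target number of nodes for each degree, for a network of a given size. It is an illustration only: the truncation at a hypothetical `max_degree`, the normalization over 1..`max_degree`, and the largest-remainder rounding are assumptions made here and are not taken from the authors' generator.

```python
def degree_quota(node_count, lam, max_degree):
    """Target number of nodes per degree for P(k) proportional to k^(-lam).

    The distribution is truncated at max_degree and normalized so that the
    quotas add up to node_count exactly.
    """
    if node_count < 1 or max_degree < 1 or lam <= 0:
        raise ValueError("node_count, max_degree and lam must be positive")
    weights = [k ** -lam for k in range(1, max_degree + 1)]
    total = sum(weights)
    exact = [node_count * w / total for w in weights]
    quota = [int(x) for x in exact]  # round down first
    # Hand out the nodes lost to rounding, largest fractional part first.
    leftover = node_count - sum(quota)
    order = sorted(range(max_degree),
                   key=lambda i: exact[i] - quota[i], reverse=True)
    for i in order[:leftover]:
        quota[i] += 1
    return {k: quota[k - 1] for k in range(1, max_degree + 1)}
```

With `lam` between 2 and 3 the bulk of the nodes falls on degree 1 and only a handful on high degrees, which is the "few hubs, many leaves" picture described above.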


The generated topology consists of three types of nodes:

• Routers, defined as nodes with one or several links. Routers do not initiate
traffic and do not accept connections.

• Servers, defined as nodes with one connection, which sometimes could have
two or even three connections. Servers only accept traffic connections and do
not initiate traffic.

• Customers (end-users), defined as nodes that have only one connection, very
seldom two connections. Customers initiate traffic connections towards
servers at random moments but usually in a time succession. For our proposed
model, we chose a 20:80 customers to servers ratio.
3.2. Scale-free network design algorithm


Several models have been presented for the evolution of scale-free networks, each
of which may lead to a different ensemble. The first suggestion was the
preferential attachment model of Barabási and Albert, which came to be known
as the “Barabási-Albert (BA)” model [13]. Several variants of this model have been
suggested. One of them, known as the “Molloy-Reed construction” [14], which
ignores the evolution and assumes only the degree distribution and no correlations
between nodes, is considered in the following. Thus, the site reached by
following a link is independent of the origin. We designed and implemented an
algorithm that generates those subsets of scale-free networks that are close to a
real computer network such as the Internet. Our application is able to handle very
large collections of nodes and to control the generation of network cycles and the
number of isolated nodes. The application was written in Python and is therefore
portable. It runs fast on an ordinary machine (less than 5 minutes for a
100,000-node model).


Network generation algorithm: 
1. set node_count and λ
2. compute the optimal number of nodes per degree 
3. create manually a small network of 3 nodes 
4. for each node from 4 to node_count 
4.1. call add_node procedure 
4.2. while adding was not successful 
4.2.1. call recompute procedure 
4.2.2. call add_node procedure 
5. save network description file 


add_node procedure

1. according to the preferential attachment, compute the degree of the parent node

2. if no degree could be chosen then exit procedure with failure code

3. compute the number of links that the new node shall establish with descendants
of its future parent, according to copy model

4. choose randomly a parent from the nodes having the degree as computed above

5. compute the descendant_list, the list of descendants of the newly chosen parent

6. create the new node and links

7. for each descendant of the descendant_list create the corresponding links

8. exit procedure with success code


recompute procedure

1. for each degree category 
1.1. calculate the factor needed to increase the optimal count of nodes per degree 
1.2. if necessary increase the optimal number of nodes per degree 

2. exit procedure 
The algorithm starts with a manually created network of several nodes; new nodes
are then added using preferential attachment and growth algorithms.
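
The growth and preferential attachment steps can be illustrated with the plain Barabási-Albert mechanism. The sketch below is deliberately simpler than the procedures listed above: it omits the per-degree quota, the copy model and the recompute step, and the 3-node chain used as the seed network is an assumption made here. The function name `grow_preferential` and its parameters are hypothetical.

```python
import random


def grow_preferential(node_count, links_per_node=1, seed=None):
    """Grow an undirected network by plain preferential attachment.

    Returns a list of (new_node, existing_node) edges. The seed network is
    the 3-node chain 0-1-2 (an assumption of this sketch).
    """
    if node_count < 3 or not 1 <= links_per_node <= 3:
        raise ValueError("need node_count >= 3 and 1 <= links_per_node <= 3")
    rng = random.Random(seed)
    edges = [(0, 1), (1, 2)]
    # Every node appears in `endpoints` once per incident link, so a uniform
    # draw from this list selects a node with probability proportional to
    # its degree.
    endpoints = [0, 1, 1, 2]
    for new in range(3, node_count):
        targets = set()
        while len(targets) < links_per_node:
            targets.add(rng.choice(endpoints))
        for target in sorted(targets):
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges
```

Calling `grow_preferential(128, links_per_node=1, seed=1)` yields a tree with 127 links. Nodes that acquire links early keep accumulating them, which produces the hub-dominated structure that the quota-based generator is designed to shape more precisely.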


We introduced an original component: the number of nodes on each degree level
is computed in advance. The preferential attachment rule is then followed
subject to the restriction of keeping the optimal number of nodes per degree.
Fig. 2 presents an example of a network with 128 nodes, the initial number of
nodes being $m_0 = 5$, with an incremental growth of one link per step.


Fig. 2. Graphic representation of a network with 128 nodes 


One can see that poorly connected nodes have smaller chances of getting new
connections. Besides following the distribution law mentioned above, some other
restrictions (for example those related to cycles and long chains) had to be applied
in order to make the generated model more realistic and similar to the Internet. A
more subtle restriction is related to the TTL (time to live), which is a way to
avoid routing loops in the real Internet. This translates into a restriction on our
topology: there can be no more than 30 nodes on the path from any node to any
other node.
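
One way to verify such a path-length restriction on a generated topology is a breadth-first search from every node. The sketch below assumes that the limit of 30 is counted in hops and that the network is given as an adjacency dictionary; both are assumptions made for illustration, as are the function names.

```python
from collections import deque


def max_hops(adjacency):
    """Longest shortest path, in hops, between any two nodes.

    adjacency maps each node to an iterable of its neighbours (undirected).
    Returns None when the network is not connected.
    """
    worst = 0
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(adjacency):
            return None  # some node is unreachable from `source`
        worst = max(worst, max(dist.values()))
    return worst


def respects_hop_limit(adjacency, limit=30):
    """True when the network is connected and no shortest path exceeds limit."""
    longest = max_hops(adjacency)
    return longest is not None and longest <= limit
```

The check runs one search per node, so its cost grows with the product of the number of nodes and the number of links. For very large models it would be more practical to enforce the limit incrementally while nodes are added than to test it afterwards.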
4. Influence of network topology on traffic behavior


The fractal nature of both the traffic and the topology of the Internet, and their
reciprocal influence, were tested by simulating web traffic on the SEN-based
Internet model. After generating a huge network model, we split it
into several sub-networks (federations) and then verified the traffic
similarities by investigating measurement series having fractal properties.
Self-similarity is a rigorous statistical property. Let us assume we have a (very
long) time series with finite mean and variance (i.e., a covariance-stationary
stochastic process). Self-similarity implies a “fractal-like” behavior: no matter
what time scale is used to examine the data, similar patterns are obtained. The main
features deduced from self-similarity are: slowly decaying variance, long-range
dependence and non-degenerate autocorrelations. The “variance-time plot” is one
of the means to test for the slowly decaying variance property. For example, if we
plot the variance of the sample mean versus the sample size on a log-log plot, most
processes yield a straight line with slope -1; for self-similar processes, the line is
much flatter. Furthermore, the autocorrelation function of the aggregated process is
indistinguishable from that of the original process.

To simulate self-similar traffic we used a superposition of ON-OFF sources
following a Pareto distribution with $1 < \alpha < 2$. The Pareto distribution has
two parameters, the shape parameter $\alpha$ and the lower cut-off parameter
$\beta$. The Pareto cumulative distribution function (CDF) is

$$F(x) = 1 - \left(\frac{\beta}{x}\right)^{\alpha},$$

and the probability density function is

$$f(x) = \frac{\alpha}{\beta}\left(\frac{\beta}{x}\right)^{\alpha+1}$$

for $x > \beta$ and $\alpha > 0$. Moreover, the parameter $\alpha$ is related to the
Hurst parameter H by

$$H = \frac{3-\alpha}{2}.$$
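
A superposition of Pareto ON-OFF sources can be sketched in a few lines. The example below draws Pareto variates by inverting the CDF, alternates ON and OFF periods of Pareto length for each source, and sums the sources slot by slot. It is an illustration under simplifying assumptions that are not taken from the simulation described here: time is slotted, a source emits one packet per slot while ON, ON and OFF periods share the same parameters, and all function names are hypothetical.

```python
import random


def pareto_sample(alpha, beta, rng):
    """Draw one Pareto variate by inverting F(x) = 1 - (beta / x)^alpha."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    return beta / u ** (1.0 / alpha)


def hurst_from_shape(alpha):
    """Expected Hurst parameter for Pareto shape alpha, H = (3 - alpha) / 2."""
    return (3.0 - alpha) / 2.0


def on_off_source(n_slots, alpha, beta, rng):
    """Packets per slot (0 or 1) for one source with Pareto ON/OFF periods."""
    slots = []
    on = rng.random() < 0.5  # random initial state
    while len(slots) < n_slots:
        length = max(1, round(pareto_sample(alpha, beta, rng)))
        # Heavy tails can produce enormous periods; never build more slots
        # than are still needed.
        length = min(length, n_slots - len(slots))
        slots.extend([1 if on else 0] * length)
        on = not on
    return slots


def aggregate_traffic(n_sources, n_slots, alpha, beta, seed=None):
    """Slot-by-slot sum of independent ON-OFF sources."""
    rng = random.Random(seed)
    total = [0] * n_slots
    for _ in range(n_sources):
        for i, packets in enumerate(on_off_source(n_slots, alpha, beta, rng)):
            total[i] += packets
    return total
```

For a shape parameter of 1.4, `hurst_from_shape` gives 0.8. The aggregate returned by `aggregate_traffic` can be re-binned at coarser time units to inspect its burstiness at several scales.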


In the simulation, the whole network was split into subnetworks with at most 40
nodes. The parameters for the simulation of such a subnetwork were: total number
of nodes per subnetwork N = 40, number of initial nodes $m_0 = 5$, and
m = 2 (i.e. at each incremental step two links are added in order to maintain a
non-zero clustering coefficient). For the simulation we used 32 traffic
sources randomly associated with TCP traffic agents. The value of the shape
coefficient $\alpha$ was 1.4, which leads to an expected value of H = 0.8. Fig. 3
shows the aggregate number of packets on three time units:
1 second, 100 milliseconds and 10 milliseconds. The gray area marks the
zoomed interval. The strong burstiness of the traffic in all three diagrams confirms
the presence of the self-similarity phenomenon.


The Hurst parameter, H, for a given sequence was calculated using three
different estimation methods: the rescaled range (R/S) plot, the
variance-time plot and the periodogram. In theory, the expected value of
H is 0.8, but the values actually obtained are, in the order in which the methods
were presented: 0.8115, 0.9761 and 1.1325. Although the coefficients given by the
last two methods are overestimated, one can conclude that the tested model presents
statistical self-similarity.
Fig. 3. Number of packets/s for three time scales: 1, 0.1 and 0.01 seconds.


Conclusions


The advantage of the model proposed here is its flexibility: it offers a universally
acceptable skeleton for potential Internet models, on which one can build features 
that could lead to further improvements. The model introduced here offers a 
realistic starting point for a general class of network topologies that combine the 
scale-free structure with a precise spatial layout. 


Although the traffic processes in high-speed Internet links exhibit asymptotic
self-similarity, their correlation structure at short time scales makes their modeling
as exact self-similar processes (like the fractional Brownian motion) inaccurate.
Based on simulations made on the SEN-based Internet model, we conclude that
Internet traffic retains its self-similar properties even under high aggregation.


The experiments have led to the following results: 1) self-similarity is an
adaptive property of traffic in the network and is not based only on the heavy-tailed
file-size distribution; 2) the scaling region of traffic self-similarity is divided into two
timescale regimes: short-range dependence (SRD), determined by bandwidth,
TCP window size and packet size, and long-range dependence (LRD),
determined by the statistical heavy tails; 3) in the LRD regime, increasing the
bandwidth does not improve throughput (or network performance); 4) there is a
significant advantage in using fractal analysis methods to solve the problem of
anomaly detection.
An accurate estimation of the Hurst parameter for the MIB variables offers a
valuable abnormality indicator for the bursty variables. Thus, by
improving the capability of predicting impending network failures, it is possible to
reduce network downtime and increase network reliability.